1. Introduction

Simple cognitive functions of the brain, such as associative recall of memories, have been
studied for 15 years using models of attractor networks. One of these is the Hopfield-Little
model, which was analyzed by means of statistical mechanics [?] because of its similarity
to spin models. To obtain a more realistic description of data processing, it was extended
to models with variable pattern activity [?]. In this context a network with neurons
$S_i \in \{0, 1\}$ instead of $\{-1, 1\}$ was studied in [?], [7] and showed enhanced storage
capacity, resembling the upper bound obtained by Gardner [?]. To account for the low
connectivity in the brain, this model was extended to random dilution of synapses and
analyzed with a dynamical approach [?]. The one-step parallel dynamics of a network can be
solved exactly using a probabilistic description [10], with a restriction to the first time
step or to very high dilution because of feedback loops. More recently this method was used
to characterize the influence of the threshold on the retrieval properties [12, 13].



Following this work, we characterize the state of the network by two overlaps with one
condensed pattern. We use the probabilistic description of the time evolution of these
overlaps to derive conditions on the storage capacity for their improvement. We also
consider the mutual information content [12] and the restriction on the threshold in
the $(Q, \alpha)$-plane for optimal performance. The effect of positive temperature is studied,
leading to an exact expression for the critical value with the corresponding threshold,
and an approximation for the time evolution at small temperatures. The analysis is
done as exactly as possible, covering the whole range of the pattern activity and the
overlaps from a dynamical point of view. In this way we can rederive some former results
by looking at special limits, such as small pattern activity and the network near retrieval.
The main purpose is to demonstrate the possibilities of this relatively simple approach,
and therefore we study a simple network with realistic features. The most important
considerations are confirmed by simulations.

In the next two subsections we introduce the model and the theoretical description.
In section 2 we study restrictions on the pattern loading and the threshold for good
retrieval properties at zero temperature. The effect of noisy updating is analyzed in
section 3, and in section 4 we summarize our results.

1.1. The model 

The network consists of $N$ neurons $S_i \in \{0, 1\}$, $i = 1, \dots, N$, and its state is described
by $S = (S_1, \dots, S_N)$. There are $p$ patterns $\xi^\nu$, $\nu = 1, \dots, p$, with elements
$\xi_i^\nu \in \{0, 1\}$, $i = 1, \dots, N$. They are stored by using a modified Hebb rule (see [?] and [5], [?]):

$$J_{ij} = \frac{c_{ij}}{c\,a(1-a)\,N}\,(1 - \delta_{ij}) \sum_{\nu=1}^{p} (\xi_i^\nu - a)(\xi_j^\nu - a) \tag{1}$$

The $\xi_i^\nu$ are independent, identically distributed random variables (IIDRV) with the
distribution function $P(\xi_i^\nu) = a\,\delta(\xi_i^\nu - 1) + (1 - a)\,\delta(\xi_i^\nu)$, $a \in [0, 1]$. Therefore the activity
of every pattern

$$a^\nu := \langle \xi_i^\nu \rangle_i = \frac{1}{N} \sum_{i=1}^{N} \xi_i^\nu \in [0, 1]$$

is a random variable with mean $a$ and variance $\frac{1}{N} a(1 - a)$ for all patterns $\nu \in \{1, \dots, p\}$.
The $c_{ij} \in \{0, 1\}$ are also IIDRV with the distribution $P(c_{ij}) = c\,\delta(c_{ij} - 1) + (1 - c)\,\delta(c_{ij})$,
$c \in [0, 1]$. Hence the number of connections per neuron $C_i$ over the number of neurons

$$C_i/N := \langle c_{ij} \rangle_j = \frac{1}{N} \sum_{j=1}^{N} c_{ij} \in [0, 1]$$

is again a random variable with mean $c$ and variance $\frac{1}{N} c(1 - c)$ for all neurons
$i \in \{1, \dots, N\}$.

The normalization factor $(c\,a(1 - a))^{-1}$ in the definition of the synapses turns out to be
useful to keep the local field at the same order of magnitude over the whole range of pattern
activity and network connectivity. As usual we have $J_{ii} = 0$ for all $i \in \{1, \dots, N\}$,
enforced by the factor $(1 - \delta_{ij})$.

The neurons are updated in parallel in discrete time steps $t \in \mathbb{N}$, and a uniform threshold
$Q$, which may be time dependent, is subtracted from the local field

$$h_i(t) := \sum_{j=1}^{N} J_{ij} S_j(t).$$

We have the usual activation function $g(x, \beta) = (1 + e^{-2\beta x})^{-1}$, with the noise parameter
$\beta = T^{-1}$ and temperature $T$, giving the update rule:

$$P(S_i(t+1) = 1) = g(h_i(t) - Q, \beta), \qquad P(S_i(t+1) = 0) = 1 - g(h_i(t) - Q, \beta) \tag{2}$$

For the noiseless case the activation function reduces to the Heaviside step function,
$g(x, \beta) \to \Theta(x)$ for $\beta \to \infty$, and the update rule is:

$$S_i(t+1) = \Theta(h_i(t) - Q) \tag{3}$$
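The model of this subsection can be simulated directly. The following is a minimal sketch, not the code used for the simulations reported here. It assumes that the couplings have the form $J_{ij} = c_{ij}\,(1-\delta_{ij}) \sum_\nu (\xi_i^\nu - a)(\xi_j^\nu - a) / (c\,a(1-a)N)$, which is a reading of the modified Hebb rule consistent with the normalization factor and the local field statistics quoted in the text. The function names `make_network`, `parallel_step` and `overlaps` are hypothetical, and the overlaps are normalized by the empirical number of active pattern sites rather than by $aN$.

```python
import math
import random

def make_network(N, p, a, c, rng):
    """Draw p random patterns and the diluted Hebb couplings J[i][j] (assumed form)."""
    xi = [[1 if rng.random() < a else 0 for _ in range(N)] for _ in range(p)]
    norm = c * a * (1.0 - a) * N
    J = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            # c_ij = 1 with probability c; the diagonal stays zero (J_ii = 0)
            if i != j and rng.random() < c:
                J[i][j] = sum((x[i] - a) * (x[j] - a) for x in xi) / norm
    return xi, J

def parallel_step(S, J, Q, T, rng):
    """Update all neurons at once with P(S_i = 1) = g(h_i - Q, 1/T)."""
    new_S = []
    for row in J:
        h = sum(w * s for w, s in zip(row, S))       # local field h_i
        if T == 0.0:
            new_S.append(1 if h > Q else 0)          # Heaviside limit
        else:
            # 0.5 * (1 + tanh(x / T)) equals 1 / (1 + exp(-2 x / T))
            prob = 0.5 * (1.0 + math.tanh((h - Q) / T))
            new_S.append(1 if rng.random() < prob else 0)
    return new_S

def overlaps(S, pattern):
    """Finite-N estimates of the two overlaps with one pattern."""
    n_on = sum(pattern)
    m_up = sum(x * s for x, s in zip(pattern, S)) / n_on
    m_down = sum((1 - x) * (1 - s) for x, s in zip(pattern, S)) / (len(pattern) - n_on)
    return m_up, m_down

# Illustrative parameters only: start exactly on pattern 0 and take one noiseless step.
rng = random.Random(0)
xi, J = make_network(300, 3, 0.3, 0.5, rng)
S1 = parallel_step(list(xi[0]), J, 0.2, 0.0, rng)
m_up, m_down = overlaps(S1, xi[0])
```

The threshold $Q = 0.2 = 0.5 - a$ is chosen only as a plausible value near retrieval; any other choice can be passed in the same way.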

1.2. Theoretical description 

We describe the network theoretically in the thermodynamic limit $N, p \to \infty$,
$\alpha = p/(cN) = \text{const.}$, denoted by Lim, yielding the following simplifications:

$$\mathrm{Lim}\, a^\nu = a, \quad \nu = 1, \dots, p \qquad \mathrm{Lim}\, C_i/N = c, \quad i = 1, \dots, N$$

We consider only the case of one condensed pattern $\mu \in \{1, \dots, p\}$ and characterize the
state of the network by two corresponding overlaps (omitting the index $\mu$):

$$m_\uparrow(t) := \frac{1}{a}\,\mathrm{Lim}\,\langle \xi_i S_i(t) \rangle_i \in [0, 1]$$

$$m_\downarrow(t) := \frac{1}{1-a}\,\mathrm{Lim}\,\langle (1 - \xi_i)(1 - S_i(t)) \rangle_i \in [0, 1]$$

With the two observables one can easily get the current network activity $A(t)$:

$$A(t) := \mathrm{Lim}\,\langle S_i(t) \rangle_i = a\,m_\uparrow(t) + (1 - a)(1 - m_\downarrow(t)) \in [0, 1]$$

The average overlaps with a noncondensed pattern $\nu \neq \mu$ are $m_\uparrow^\nu(t) = A(t)$ and
$m_\downarrow^\nu(t) = 1 - A(t)$. As pattern $\mu$ is condensed, $m_\uparrow$ resp. $m_\downarrow$ have to be substantially
bigger than these values, and the case $m_\uparrow = A$, $m_\downarrow = 1 - A$, which is equivalent to
$m_\uparrow + m_\downarrow = 1$, corresponds to the failure of retrieval. If the system is in a state with
$m_\uparrow(t) + m_\downarrow(t) = 1$ we have $m_\uparrow(t+1) + m_\downarrow(t+1) = 1$, which can easily be seen from the
evolution equation (6) derived later. Hence an uncorrelated state will stay uncorrelated.
We will often consider the network in a state where $A(t) = a$. Given $m_\uparrow$ this is achieved
by setting:

$$m_\downarrow(t) = m_{\downarrow A}(t) := 1 - \frac{a}{1 - a}\,(1 - m_\uparrow(t))$$
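The relation between the two overlaps and the network activity is simple enough to check numerically. The sketch below uses the hypothetical helper names `activity` and `m_down_fixed_activity`; it only restates the two formulas above.

```python
def activity(m_up, m_down, a):
    """Network activity A = a * m_up + (1 - a) * (1 - m_down)."""
    return a * m_up + (1.0 - a) * (1.0 - m_down)

def m_down_fixed_activity(m_up, a):
    """Value of m_down that keeps the network activity equal to the pattern activity a."""
    return 1.0 - a / (1.0 - a) * (1.0 - m_up)

# Example: a = 0.1 and m_up = 0.6 give m_down of about 0.956,
# the value used later for the comparison of thresholds.
m_down_A = m_down_fixed_activity(0.6, 0.1)
```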

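To model the time evolution of the network we calculate the mean and the variance of
the local field $h_i(t)$ for $\xi_i = 0$ and $\xi_i = 1$, by splitting it into a signal part $h_i^{s}$ and a noise
part $h_i^{n}$ as usual:

$$h_i = \frac{\xi_i^\mu - a}{c\,a(1-a)N} \sum_{j \neq i} c_{ij} (\xi_j^\mu - a) S_j + \frac{1}{c\,a(1-a)N} \sum_{\nu \neq \mu} (\xi_i^\nu - a) \sum_{j \neq i} c_{ij} (\xi_j^\nu - a) S_j = h_i^{s} + h_i^{n}$$

The indication of the time dependence is omitted. Now we average over the random
distributions of the patterns $\xi$, the connections $c$ and the state of the network $S$ in the
thermodynamic limit, fixing the value of $\xi_i$. Due to self-averaging the signal part has a definite
value, $\mu_\uparrow$ resp. $\mu_\downarrow$, but the noise part is a Gaussian random variable with mean 0 and variance
$\sigma^2$:

$$\mu_\uparrow = \mathrm{Lim}\,\langle\langle h_i^{s} \,|\, \xi_i = 1 \rangle\rangle_{\xi, c, S} = (1 - a)(m_\uparrow + m_\downarrow - 1)$$

$$\mu_\downarrow = \mathrm{Lim}\,\langle\langle h_i^{s} \,|\, \xi_i = 0 \rangle\rangle_{\xi, c, S} = -a\,(m_\uparrow + m_\downarrow - 1)$$

$$\sigma^2 = \mathrm{Lim}\,\langle\langle (h_i^{n})^2 \rangle\rangle_{\xi, c, S} = \alpha A \tag{4}$$

The two random variables $h_i|_{\xi_i = 0, 1} - Q$ have mean $\mu_\downarrow - Q$ resp. $\mu_\uparrow - Q$, standard
deviation $\sigma$ and the distribution functions

$$p_\uparrow(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x - \mu_\uparrow + Q)^2}{2\sigma^2}\right) \quad \text{resp.} \quad p_\downarrow(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x - \mu_\downarrow + Q)^2}{2\sigma^2}\right) \tag{5}$$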



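With the dynamical equations we can compute $m_\uparrow$ and $m_\downarrow$ for the next time step
by averaging the activation function over the distribution of the $h_i - Q$:

$$m_\uparrow(t+1) = \langle g(h_i(t) - Q, \beta) \rangle = \int_{-\infty}^{\infty} g(x, \beta)\,p_\uparrow(x)\,dx$$

$$m_\downarrow(t+1) = \langle 1 - g(h_i(t) - Q, \beta) \rangle = \int_{-\infty}^{\infty} (1 - g(x, \beta))\,p_\downarrow(x)\,dx \tag{6}$$

For the noiseless case this simplifies to:

$$m_\uparrow(t+1) = \int_{0}^{\infty} p_\uparrow(x)\,dx = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{\mu_\uparrow - Q}{\sqrt{2}\,\sigma}\right)\right]$$

$$m_\downarrow(t+1) = \int_{-\infty}^{0} p_\downarrow(x)\,dx = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{Q - \mu_\downarrow}{\sqrt{2}\,\sigma}\right)\right] \tag{7}$$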
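The noiseless one-step map can be evaluated with the standard error function. The sketch below is an illustration under the signal and noise expressions of equation (4); the function names `field_stats` and `step_T0` are hypothetical, and the second overlap is computed from the analogous integral over $p_\downarrow$.

```python
import math

def field_stats(m_up, m_down, a, alpha):
    """Signal means mu_up, mu_down and noise width sigma of the local field, eq. (4)."""
    A = a * m_up + (1.0 - a) * (1.0 - m_down)        # network activity
    s = m_up + m_down - 1.0
    return (1.0 - a) * s, -a * s, math.sqrt(alpha * A)

def step_T0(m_up, m_down, a, alpha, Q):
    """One noiseless parallel step of the two overlaps (requires alpha * A > 0)."""
    mu_up, mu_down, sigma = field_stats(m_up, m_down, a, alpha)
    new_up = 0.5 * (1.0 + math.erf((mu_up - Q) / (math.sqrt(2.0) * sigma)))
    new_down = 0.5 * (1.0 + math.erf((Q - mu_down) / (math.sqrt(2.0) * sigma)))
    return new_up, new_down

# An uncorrelated state (m_up + m_down = 1) stays uncorrelated for any threshold:
up, down = step_T0(0.3, 0.7, a=0.3, alpha=0.5, Q=0.1)
```

Iterating `step_T0` with a fixed or state-dependent threshold gives the full time evolution in the strongly diluted limit discussed next.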
In the calculations above we considered the neurons to be IIDRVs with activity $A$.
This can be realized in the first time step, but after one iteration the neurons may be
correlated because of feedback loops. In the limit of strong dilution, where the neurons do
not have common ancestors in former time steps, the description is valid for the whole
retrieval process. To achieve this the number of connections at each neuron $C_i$ has to
be of order $\ln N$, which includes $c \sim N^{-1} \ln N \to 0$ for $N \to \infty$. For further details see
[10], [11] and references therein.

There are $1/(aN)$ corrections to these results, which become relevant in finite systems with
small $a$. Therefore we will only present simulations for relatively high $a$, which is not a
limitation of the theory.



2. Dynamical properties at zero temperature 

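In the following we look at the one-step dynamics and derive conditions on the pattern
loading and the threshold for improvement of $m_\uparrow$ and $m_\downarrow$ at $T = 0$. We set
$\Delta m_\uparrow = m_\uparrow(t+1) - m_\uparrow(t) > 0$ and $\Delta m_\downarrow > 0$ in equation (7) and get ($\mathrm{inverf}(x)$ is
the inverse error function $\mathrm{erf}^{(-1)}(x)$):

$$\frac{\mu_\uparrow - Q}{\sigma} > c_\uparrow := \sqrt{2}\,\mathrm{inverf}(2 m_\uparrow - 1), \qquad \frac{Q - \mu_\downarrow}{\sigma} > c_\downarrow := \sqrt{2}\,\mathrm{inverf}(2 m_\downarrow - 1) \tag{8}$$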



2.1. Critical storage capacity and mutual information 

By adding the two equations (8) we get the condition $\mu_\uparrow - \mu_\downarrow > \sigma (c_\uparrow + c_\downarrow)$ for
improvement, which has to be satisfied independently of the choice of $Q$, assuming that
it is chosen optimally. With the expressions for $\mu_\uparrow$, $\mu_\downarrow$ and $\sigma$ from equation (4) we can
solve for the critical value $\alpha_c$ where both improvements are zero:

$$\alpha_c = \frac{(m_\uparrow + m_\downarrow - 1)^2}{(c_\uparrow + c_\downarrow)^2 A} \quad \text{for } m_\uparrow + m_\downarrow \neq 1, \qquad \alpha_c = \frac{1}{2\pi A}\,e^{-c_\uparrow^2} \quad \text{for } m_\uparrow + m_\downarrow = 1 \tag{9}$$

From the inequality we get the following conditions for improvement of $m_\uparrow$ and $m_\downarrow$:

$$\alpha < \alpha_c \ \text{ if } m_\uparrow + m_\downarrow > 1, \qquad \alpha > \alpha_c \ \text{ if } m_\uparrow + m_\downarrow < 1 \tag{10}$$

This is the case because we have $c_\uparrow + c_\downarrow < 0$ for $m_\uparrow + m_\downarrow < 1$, and in this region we can
have improvement for arbitrarily high pattern loadings. So if $\alpha$ is too high the network
always develops towards a state uncorrelated with the retrieval pattern, where we have
$m_\uparrow + m_\downarrow = 1$.

In contrast to the usual storage capacity in equilibrium, $\alpha_c$ is a dynamical variable,
depending on $m_\uparrow$ and $m_\downarrow$, as well as on $a$ and $c$.
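The critical loading of equation (9) is easy to evaluate numerically. The sketch below relies on the identity $\sqrt{2}\,\mathrm{inverf}(2m - 1) = \Phi^{-1}(m)$, where $\Phi^{-1}$ is the standard normal quantile function, so that Python's `statistics.NormalDist().inv_cdf` can stand in for the inverse error function. The names `c_of` and `alpha_c` are hypothetical, and the overlaps must lie strictly between 0 and 1.

```python
import math
from statistics import NormalDist

def c_of(m):
    """c = sqrt(2) * inverf(2 m - 1), written as the standard normal quantile of m."""
    return NormalDist().inv_cdf(m)

def alpha_c(m_up, m_down, a):
    """Critical pattern loading where both one-step improvements vanish, eq. (9)."""
    A = a * m_up + (1.0 - a) * (1.0 - m_down)        # network activity
    s = m_up + m_down - 1.0
    if abs(s) < 1e-12:
        # limiting value on the line m_up + m_down = 1
        return math.exp(-c_of(m_up) ** 2) / (2.0 * math.pi * A)
    return s * s / ((c_of(m_up) + c_of(m_down)) ** 2 * A)

# Example state with m_up + m_down > 1: loadings below this value allow improvement.
example = alpha_c(0.9, 0.9, a=0.3)
```

Evaluating `alpha_c` slightly off the line $m_\uparrow + m_\downarrow = 1$ approaches the limiting branch, which gives a quick consistency check of the two cases.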



Dependence on network connectivity

As we defined $\alpha = p/(cN)$ we see that the maximal number of storable patterns
decreases proportionally to the number of connections per neuron, because $\alpha_c$ is
independent of $c$. This is in accordance with former studies of random dilution in
equilibrium (e.g. [?]).

Dependence on pattern activity

For decreasing $a$ the capacity increases monotonically if $m_\uparrow + m_\downarrow > 1$ and decreases
if $m_\uparrow + m_\downarrow < 1$. If $m_\uparrow$ and $m_\downarrow$ are fixed we get the finite value
$\alpha_c|_{a=0} = (m_\uparrow + m_\downarrow - 1)^2 / \left((c_\uparrow + c_\downarrow)^2 (1 - m_\downarrow)\right)$.
When the activity of the network is equal to that of the stored patterns we have
$A = a$, $m_\downarrow = m_{\downarrow A}$ and therefore $c_\downarrow|_{m_{\downarrow A}} = \sqrt{2}\,\mathrm{inverf}\left[1 - \frac{2a}{1-a}(1 - m_\uparrow)\right]$. By using
the approximation $\mathrm{inverf}(x) \approx (-\ln(1 - x))^{1/2}$ for $x \to 1$ (see [14]) we get the
leading behaviour of $\alpha_c$ for the limit $a \to 0$:

$$\alpha_c \to \frac{(m_\uparrow + m_{\downarrow A} - 1)^2}{(c_\downarrow|_{m_{\downarrow A}})^2\,a} \to \frac{m_\uparrow^2}{-2a \ln a} \quad \text{for } a \to 0,\ m_\uparrow \ll 1 - 2a \tag{11}$$

This is known from former studies [?] as an approximation to the equilibrium
storage capacity for small $a$ and resembles the upper bound by Gardner [?].

Dependence on state of the network

If one of the two parameters, $m_\uparrow$ and $m_\downarrow$, is equal to 1 or 0, $\alpha_c$ is zero, because for
$m_\uparrow, m_\downarrow \to 1$ we have $c_\uparrow, c_\downarrow \to \infty$. This means we can have perfect retrieval only for
a finite number of patterns.



For very small pattern activities the storage capacity may increase, but the number of
active neurons and the information represented by a single pattern decrease. Therefore
the maximal amount of information storable in the network gives a more sensible
characterization of performance. To account for the loss of information due to retrieval
errors, i.e. $m_\uparrow, m_\downarrow \neq 1$, one uses the mutual information [12]. It can be defined as the
negative logarithm of the conditional probability of choosing a network state with $m_\uparrow$
and $m_\downarrow$, given the activity $A$:

$$I := -\mathrm{Lim}\,\frac{1}{N} \ln \left[ \binom{aN}{m_\uparrow a N} \binom{(1-a)N}{(1 - m_\downarrow)(1-a)N} \Big/ \binom{N}{AN} \right]$$

$$= a\,\big(m_\uparrow \ln m_\uparrow + (1 - m_\uparrow) \ln(1 - m_\uparrow)\big) + (1 - a)\big(m_\downarrow \ln m_\downarrow + (1 - m_\downarrow) \ln(1 - m_\downarrow)\big) - A \ln A - (1 - A) \ln(1 - A)$$

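This is the average amount of information per neuron of the retrieval pattern that
can be obtained from the network in a given state $m_\uparrow$, $m_\downarrow$. For the maximal mutual
information content $i_m$ per synapse in units of bits we get:

$$i_m = \mathrm{Lim}\,\frac{p_c N I}{c N^2 \ln 2} = \frac{\alpha_c}{\ln 2} \left\{ a \left[ m_\uparrow \ln \frac{m_\uparrow}{A} + (1 - m_\uparrow) \ln \frac{1 - m_\uparrow}{1 - A} \right] + (1 - a) \left[ m_\downarrow \ln \frac{m_\downarrow}{1 - A} + (1 - m_\downarrow) \ln \frac{1 - m_\downarrow}{A} \right] \right\} \tag{12}$$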
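The information measures can be evaluated directly from the overlaps. The following sketch computes the information per neuron $I$ in nats from the closed form given above and converts a given loading into bits per synapse as in equation (12). The names `info_per_neuron` and `info_per_synapse_bits` are hypothetical, and the loading `alpha` is passed in explicitly rather than computed, so any value (for example the critical one) can be inserted.

```python
import math

def _xlogx(x):
    """x * ln(x) with the usual convention 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0.0 else 0.0

def _h(x):
    """x ln x + (1 - x) ln(1 - x), the negative binary entropy in nats."""
    return _xlogx(x) + _xlogx(1.0 - x)

def info_per_neuron(m_up, m_down, a):
    """Mutual information I per neuron (in nats) for a state with overlaps m_up, m_down."""
    A = a * m_up + (1.0 - a) * (1.0 - m_down)        # network activity
    return a * _h(m_up) + (1.0 - a) * _h(m_down) - _h(A)

def info_per_synapse_bits(alpha, m_up, m_down, a):
    """Information per synapse in bits at loading alpha = p / (c N)."""
    return alpha * info_per_neuron(m_up, m_down, a) / math.log(2.0)

# An uncorrelated state (m_up = A, m_down = 1 - A) carries no information:
zero = info_per_neuron(0.3, 0.7, a=0.3)
```

For perfect retrieval, $m_\uparrow = m_\downarrow = 1$, the same function returns the entropy of a single pattern element, $-a \ln a - (1-a)\ln(1-a)$.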
This gives the maximal amount of information per synapse storable in the network,
subject to the constraint of improvement of $m_\uparrow$ and $m_\downarrow$. It has more reasonable properties
for low pattern activity than $\alpha_c$:

• If the network is in a state uncorrelated with the retrieval pattern, i.e. $m_\uparrow + m_\downarrow = 1$,
we have $i_m|_{m_\uparrow + m_\downarrow = 1} = 0$.

• We have $i_m|_{a = 0, 1} = 0$ for any fixed $m_\uparrow$ and $m_\downarrow$, with $A \neq a$, as there is certainly no
information stored when all pattern elements have the same value.

• For $m_\downarrow = m_{\downarrow A}$ or $A = a$, using the approximation of $\alpha_c$ for small $a$, we get:

$$i_m \approx \frac{m_\uparrow^2}{-2a \ln a\,\ln 2}\,(-a\,m_\uparrow \ln a) \to \frac{m_\uparrow^3}{2 \ln 2} \quad \text{for } a \to 0,\ m_\uparrow \ll 1 - 2a \tag{13}$$

That means if we keep the network activity fixed to $a$ during the retrieval process we
can obtain a nonzero amount of information from the network for $a \to 0$, although
the information content of a single pattern vanishes. This result is in accordance
with former studies in equilibrium (see e.g. [?], [?]).

The rederivation of $\alpha_c$ and $i_m$ in the limit $a \to 0$ shows that the enhanced properties
for $a \to 0$ are only given during the retrieval process if we have $A = a$. Further studies
showed that this fixed relation between $m_\uparrow$ and $m_\downarrow$ is not optimal for maximizing the
storage capacity. The optimal relation can be found, but the resulting capacity is of the
same order, as the maximum according to Gardner is already reached.

2.2. Involving the threshold 

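Now we look at the two conditions (8) separately. After inserting $\sigma = \sqrt{\alpha A}$ from (4)
we can solve for $\alpha_\uparrow$ resp. $\alpha_\downarrow$ where $\Delta m_\uparrow = 0$ resp. $\Delta m_\downarrow = 0$:

$$\alpha_\uparrow = \frac{(\mu_\uparrow - Q)^2}{c_\uparrow^2 A}, \qquad \alpha_\downarrow = \frac{(Q - \mu_\downarrow)^2}{c_\downarrow^2 A} \tag{14}$$

Because the conditions were squared, only one branch of the parabolas $\alpha_\uparrow(Q)$ and $\alpha_\downarrow(Q)$
imposes a condition on $\alpha$, depending on the values of $m_\uparrow$ and $m_\downarrow$. For $m_\uparrow > 0.5$ we
have $c_\uparrow > 0$ and the condition for improvement of $m_\uparrow$ (8) gives an upper bound on $\alpha$.
For $m_\uparrow < 0.5$ the sign of $c_\uparrow$ is changed and therefore we have a lower bound on $\alpha$ for
improvement of $m_\uparrow$:

$$m_\uparrow > 0.5 \Rightarrow \begin{cases} \alpha < \alpha_\uparrow & \text{for } Q < \mu_\uparrow \\ \text{no improvement} & \text{for } Q > \mu_\uparrow \end{cases} \qquad m_\uparrow < 0.5 \Rightarrow \begin{cases} \alpha > \alpha_\uparrow & \text{for } Q > \mu_\uparrow \\ \alpha > 0 & \text{for } Q < \mu_\uparrow \end{cases} \tag{15}$$

For $m_\uparrow = 0.5$ the parabola reduces to a vertical line at $Q = \mu_\uparrow$ and we have improvement
if $Q < \mu_\uparrow$ for unlimited pattern loading and no improvement for $Q > \mu_\uparrow$. The conditions
for improvement of $m_\downarrow$ can be obtained in the same way.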

For good retrieval qualities we would like to have improvement of both overlaps, which
is ensured in a region of the $(Q, \alpha)$-plane limited by the valid branches of $\alpha_\uparrow$ and $\alpha_\downarrow$
(see figure 1). Hence we have a lower and upper bound for the region of the threshold,
$Q \in [\mu_\downarrow + c_\downarrow \sigma,\ \mu_\uparrow - c_\uparrow \sigma]$, that will lead to improvements of $m_\uparrow$ and $m_\downarrow$ depending on
$\alpha$. Unless $m_\uparrow = m_\downarrow = 0.5$ there is always an intersection of the valid branches of $\alpha_\uparrow$
and $\alpha_\downarrow$, where both improvements are equal to zero. Here the interval for $Q$ reduces to
a single point and we have $\alpha_\uparrow = \alpha_\downarrow = \alpha_c$. The value of $Q$ at this point is:

$$Q_c = \left( \frac{c_\downarrow}{c_\uparrow + c_\downarrow} - a \right) (m_\uparrow + m_\downarrow - 1) \quad \text{for } m_\uparrow + m_\downarrow \neq 1, \qquad Q_c = -\frac{c_\uparrow}{\sqrt{2\pi}}\,e^{-c_\uparrow^2/2} \quad \text{for } m_\uparrow + m_\downarrow = 1 \tag{16}$$

The dependence on $a$ is linear and for $m_\uparrow = m_\downarrow = 1$ it reduces to $Q_c = 0.5 - a$, which
is known to be the optimal threshold near retrieval [7]. For $m_\uparrow + m_\downarrow > 1$ the region
of simultaneous improvement is bounded below and above and the maximal possible
storage is $\alpha_c$. For $m_\uparrow + m_\downarrow < 1$ the region is only bounded below and $\alpha_c$ is the minimal
possible pattern loading for improvement, in accordance with (10). In figure 2 our
considerations are confirmed by simulations for $a = 0.3$, $m_\uparrow = m_\downarrow = 0.9$ and $m_\uparrow = 0.3$,
$m_\downarrow = 0.9$.
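The allowed threshold interval and the critical point can be computed for any state of the network. The sketch below is an illustration of the conditions (8) and of the first case of equation (16), again using the standard normal quantile for $\sqrt{2}\,\mathrm{inverf}(2m - 1)$. The names `c_of`, `critical_point` and `threshold_window` are hypothetical, and the case $m_\uparrow + m_\downarrow = 1$ is not handled.

```python
import math
from statistics import NormalDist

def c_of(m):
    """c = sqrt(2) * inverf(2 m - 1), written as the standard normal quantile of m."""
    return NormalDist().inv_cdf(m)

def critical_point(m_up, m_down, a):
    """Return (Q_c, alpha_c) for m_up + m_down != 1."""
    A = a * m_up + (1.0 - a) * (1.0 - m_down)
    s = m_up + m_down - 1.0
    c_up, c_down = c_of(m_up), c_of(m_down)
    alpha_crit = s * s / ((c_up + c_down) ** 2 * A)
    Q_crit = (c_down / (c_up + c_down) - a) * s
    return Q_crit, alpha_crit

def threshold_window(m_up, m_down, a, alpha):
    """Interval [mu_down + c_down sigma, mu_up - c_up sigma] from the conditions (8)."""
    A = a * m_up + (1.0 - a) * (1.0 - m_down)
    s = m_up + m_down - 1.0
    sigma = math.sqrt(alpha * A)
    lower = -a * s + c_of(m_down) * sigma
    upper = (1.0 - a) * s - c_of(m_up) * sigma
    return lower, upper

# For m_up = m_down = 0.9 and a = 0.3 the window closes at alpha_c around Q_c.
Q_crit, alpha_crit = critical_point(0.9, 0.9, 0.3)
window_at_half_load = threshold_window(0.9, 0.9, 0.3, 0.5 * alpha_crit)
```

Thresholds inside the returned window improve both overlaps in one step; if the lower bound exceeds the upper bound, no threshold does.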



To illustrate the use of the derived restrictions in the $(Q, \alpha)$-plane we discuss some
choices of thresholds as functions of the state of the network which ensure $Q|_{\alpha = \alpha_c} = Q_c$,
i.e. they allow for the critical storage capacity $\alpha_c$ (see figure 3).

• Critical threshold

The easiest thing is of course to choose $Q = Q_c$, which certainly meets the
requirement. But if $m_\uparrow$ resp. $m_\downarrow < 0.5$ it does not improve if $\alpha < \alpha_c$ (see $m_\uparrow$
in figure 2). This choice of $Q$ maximizes the term $\Delta c_\uparrow + \Delta c_\downarrow$.
Figure 2. Simulations for $\alpha_\uparrow(Q)$ and $\alpha_\downarrow(Q)$ with $a = 0.3$, $T = 0$ and $m_\uparrow = m_\downarrow = 0.9$
(solid lines, filled data points), $m_\uparrow = 0.3$, $m_\downarrow = 0.9$ (dashed lines, unfilled data points).
Data obtained with $N = 600$ and averaged over 10 networks, each with at least 150 stimuli. The average
improvement of $m_\uparrow$ and $m_\downarrow$ was recorded as a function of $\alpha$ for different values of $Q$.
At $\alpha = \alpha_\uparrow$ resp. $\alpha_\downarrow$, $\Delta m_\uparrow$ resp. $\Delta m_\downarrow$ vanishes.

• Maximizing $\Delta m_\uparrow + \Delta m_\downarrow$

Just for comparison we will also look at $Q = Q_m := \frac{1}{2}(\mu_\uparrow + \mu_\downarrow)$, which maximizes
$\Delta m_\uparrow + \Delta m_\downarrow$. This seems to be a somewhat 'natural' choice, but it only fulfills
$Q_m|_{\alpha = \alpha_c} = Q_c$ when $m_\uparrow = m_\downarrow$, where we have $Q_m = Q_c$.

• Ensuring $A(t+1) = a$

Another possibility is to choose $Q = Q_a$ in order to get the network activity
$A(t+1) = a$, ensuring an enhanced storage capacity for small $a$. In the retrieval
state we also have $A = a$, so we can reach it with this threshold dependence.
However, the condition $Q_a|_{\alpha = \alpha_c} = Q_c$ is only obeyed for $A(t) = a$.

• Preserving $m_\uparrow / m_\downarrow$

We can also choose $Q = Q_r$ to preserve the ratio $m_\uparrow(t+1)/m_\downarrow(t+1) =
m_\uparrow(t)/m_\downarrow(t) = r$. We only look at this choice because it ensures $Q_r|_{\alpha = \alpha_c} = Q_c$,
but if we start out with a ratio considerably different from 1 the attractor reached
with this threshold will not be very close to the retrieval state. So the only case where
this is really interesting is $r = 1$, where we have $Q_r = Q_c = Q_m$.

In figure 3 we illustrate the four choices in the $(Q, \alpha)$-plane with $a = 0.1$, $m_\uparrow = 0.6$ and
$m_\downarrow = m_{\downarrow A} = 0.956$. Therefore we have $Q_a|_{\alpha = \alpha_c} = Q_c$ and the curve of $Q_m$ does not
cross the critical point $(Q_c, \alpha_c)$. One can compare the different choices of thresholds
by the improvements $\Delta m_\uparrow$ and $\Delta m_\downarrow$, which can easily be calculated. As $m_\uparrow = 0.6$ the
upper limit for $\Delta m_\uparrow$ is 0.4 and for $\Delta m_\downarrow$ it is $1 - m_{\downarrow A} = 0.044$. For small $\alpha$ these limits
are reached by $Q_c$ and $Q_a$. As expected $Q_a$ gives the highest improvements because it
has the best position between the two parabolas in figure 3. The improvement of $m_\uparrow$
resp. $m_\downarrow$ is a monotonically increasing function of the distance between $Q$ and $Q|_{\alpha = \alpha_\uparrow}$
resp. $Q|_{\alpha = \alpha_\downarrow}$. $Q_c$ and $Q_r$ give the highest possible storage capacity $\alpha_c$ in any case, but
the improvements $\Delta m_\uparrow$ look rather poor, especially for $Q_r$.

Figure 3. Comparison of thresholds for $m_\uparrow = 0.6$, $A = a = 0.1$ and $T = 0$. Threshold
choices: $Q_c$, $Q_m$, $Q_a$, $Q_r$.

If we consider a case with $m_\uparrow = m_\downarrow$ the picture is symmetric with respect to the axis
$Q = Q_c$ and therefore $Q_m = Q_r = Q_c$ give the best possible improvements $\Delta m_\uparrow = \Delta m_\downarrow$.
Thus the optimal choice of the threshold during the retrieval process depends on the
situation. If one has $m_\uparrow(0) \approx m_\downarrow(0)$ and $a \approx 0.5$ the best choice for fast retrieval is $Q_c$.
But in the case of small pattern activity this limits the storage capacity and it is better
to choose $Q = Q_a$, independent of the initial condition.

3. Dynamics for positive temperature 

With increasing temperature the critical storage capacity, as defined in section 2.1, decreases
and we take $T_c$ as the value of $T$ where $\alpha_c = 0$. The allowed region for the threshold
reduces to a single point $Q_c|_{T = T_c}$, which we will also calculate. After that we make an
approximation for $\alpha_\uparrow$ and $\alpha_\downarrow$ for small temperatures.

3.1. Critical temperature 

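We look at the dynamical equation (6) in the limit $\alpha \to 0$, where the distribution
functions $p_\uparrow$ resp. $p_\downarrow$ reduce to delta functions, because $\sigma^2 \sim \alpha$:

$$p_\uparrow(x) \to \delta(x - \mu_\uparrow + Q), \qquad p_\downarrow(x) \to \delta(x - \mu_\downarrow + Q)$$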





Dynamical properties of a neural network 



Figure 4. Critical temperature $T_c(m_\downarrow)$ for $m_\uparrow = 0.5$, $m_\uparrow = 0.9$ and
$m_\uparrow = 0.999$. Data with $N = 2000$, $\alpha = 0.0005$ and $m_\uparrow = 0.5$ (*, $a = 0.3$ and
$c = 0.5$), $m_\uparrow = 0.9$ (○, $a = 0.3$ and $c = 1$) and $m_\uparrow = 0.999$ (□, $a = 0.5$ and $c = 1$).

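In this limit we can evaluate the integrals in the evolution equation (6) and after solving
for $\beta$ and $Q$ we get the critical temperature $T_c = \beta^{-1}$ and the corresponding threshold
$Q_c|_{T = T_c}$.

For $m_\uparrow + m_\downarrow \neq 1$:

$$T_c = \frac{-2\,(m_\uparrow + m_\downarrow - 1)}{\ln\left(\frac{1}{m_\uparrow} - 1\right) + \ln\left(\frac{1}{m_\downarrow} - 1\right)} \tag{17}$$

$$Q_c|_{T = T_c} = \left( \frac{\ln\left(\frac{1}{m_\downarrow} - 1\right)}{\ln\left(\frac{1}{m_\uparrow} - 1\right) + \ln\left(\frac{1}{m_\downarrow} - 1\right)} - a \right) (m_\uparrow + m_\downarrow - 1) \tag{18}$$

For $m_\uparrow + m_\downarrow = 1$:

$$T_c = 2 m_\uparrow (1 - m_\uparrow), \qquad Q_c|_{T = T_c} = m_\uparrow (1 - m_\uparrow) \ln\left(\frac{1}{m_\uparrow} - 1\right)$$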

The critical temperature is independent of the pattern activity and the network
connectivity. It is easy to see that the maximum value is $T_c = 0.5$ at $m_\uparrow = m_\downarrow = 0.5$,
and if at least one of the two observables is equal to zero or one we have $T_c = 0$. We see
$T_c(m_\downarrow)$ in figure 4 for several values of $m_\uparrow$, confirmed by simulations which were done at
different values of $a$ and $c$ to demonstrate that it is independent of these parameters.
The critical threshold at $T = T_c$ has the same linear dependence on $a$ as for zero
temperature, except for an additive constant depending on $m_\uparrow$ and $m_\downarrow$. Only for
$m_\uparrow, m_\downarrow \approx 1$ with $m_\uparrow \neq m_\downarrow$ is there a considerable difference between the two; in the case
$m_\uparrow = m_\downarrow$ both are equal for every $a$. This indicates that for increasing temperature the
critical point in the $(Q, \alpha)$-plane is mainly shifted to lower $\alpha$ with fixed $Q$, which is true
to high accuracy as we will see next.
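The critical temperature and the corresponding threshold follow from two logarithms. The sketch below is an illustration of equations (17) and (18) together with the limiting case $m_\uparrow + m_\downarrow = 1$; the function name `critical_temperature` is hypothetical and the overlaps must lie strictly between 0 and 1.

```python
import math

def critical_temperature(m_up, m_down, a):
    """Return (T_c, Q_c at T_c) in the limit alpha -> 0."""
    s = m_up + m_down - 1.0
    l_up = math.log(1.0 / m_up - 1.0)
    l_down = math.log(1.0 / m_down - 1.0)
    if abs(s) < 1e-12:
        # limiting case m_up + m_down = 1
        return 2.0 * m_up * (1.0 - m_up), m_up * (1.0 - m_up) * l_up
    T_c = -2.0 * s / (l_up + l_down)
    Q_c = (l_down / (l_up + l_down) - a) * s
    return T_c, Q_c

# Maximum of T_c at m_up = m_down = 0.5:
T_max, _ = critical_temperature(0.5, 0.5, a=0.3)

# Consistency check for a retrieval-like state: at T = T_c and Q = Q_c the
# activation function reproduces m_up when evaluated at the signal mu_up.
T_c, Q_c = critical_temperature(0.9, 0.9, a=0.3)
mu_up = (1.0 - 0.3) * (0.9 + 0.9 - 1.0)
g = 1.0 / (1.0 + math.exp(-2.0 * (mu_up - Q_c) / T_c))
```

For $m_\uparrow = m_\downarrow$ the returned threshold equals $(0.5 - a)(m_\uparrow + m_\downarrow - 1)$, the same value as at zero temperature.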
3.2. Expansion for small temperatures

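For $T > 0$ we solve the conditions $m_\uparrow(t+1) = m_\uparrow(t)$ and $m_\downarrow(t+1) = m_\downarrow(t)$ for $\alpha_\uparrow$ and
$\alpha_\downarrow$ by using an approximation for low temperatures, as $T_c \in [0, 0.5]$. In the following we
look at the first condition in order to get $\alpha_\uparrow$; the same can be done for $\alpha_\downarrow$. After some
algebra given in the appendix we have the following approximation, which is accurate
for $T \ll 2\sigma^2/(\mu_\uparrow - Q)$:

$$m_\uparrow(t+1) \approx m_\uparrow(t+1)|_{T=0} - T^2\,\frac{\mu_\uparrow - Q}{2\sqrt{2\pi}\,\sigma^3}\,e^{-\left(\frac{\mu_\uparrow - Q}{\sqrt{2}\sigma}\right)^2}\,\frac{\pi^2}{12} \tag{19}$$

By looking at the numerical solutions of $\alpha_\uparrow(Q)$ we see that the shape of the parabola
is more or less unaltered; it is just shifted down. Therefore we make the ansatz
$\alpha_\uparrow = \alpha_\uparrow|_{T=0} - f(T)$, where the function $f(T)$ accounts for the downshift of $\alpha_\uparrow$ as an
additional source of noise, like in [?]. Now we replace $\alpha_\uparrow|_{T=0}$ by $\alpha_\uparrow + f(T)$ in the
dynamical equation for zero temperature (7) and expand at $f(T) = 0$ to first order:

$$m_\uparrow(t+1) \approx m_\uparrow(t+1)|_{T=0} - \frac{\mu_\uparrow - Q}{2\sqrt{2\pi}\,\sigma^3}\,e^{-\left(\frac{\mu_\uparrow - Q}{\sqrt{2}\sigma}\right)^2} A\,f(T)$$

This term looks very similar to the expansion of the dynamical equation (19), and by
comparing the two we get the following approximation:

$$\alpha_{\uparrow 1} \approx \alpha_\uparrow|_{T=0} - \gamma_1 T^2, \qquad \alpha_{\downarrow 1} \approx \alpha_\downarrow|_{T=0} - \gamma_1 T^2, \qquad \text{where } \gamma_1 = \frac{\pi^2}{12 A} \tag{20}$$

The calculation for $\alpha_\downarrow$ gives the same correction to this order. We labeled the
approximation with 1 because we can also think of another one. By assuming a quadratic
$T$-dependence of $f$ we can directly make the ansatz $\alpha_\uparrow = \alpha_\uparrow|_{T=0} - \gamma_2 T^2$. Knowing the
critical temperature we get the condition $\alpha_c|_{T = T_c} = \alpha_c|_{T=0} - \gamma_2 T_c^2 = 0$, yielding another
approximation:

$$\alpha_{\uparrow 2} \approx \alpha_\uparrow|_{T=0} - \gamma_2 T^2, \qquad \alpha_{\downarrow 2} \approx \alpha_\downarrow|_{T=0} - \gamma_2 T^2,$$

$$\text{where } \gamma_2 = \frac{1}{4A}\,\frac{\left[\ln\left(\frac{1}{m_\uparrow} - 1\right) + \ln\left(\frac{1}{m_\downarrow} - 1\right)\right]^2}{(c_\uparrow + c_\downarrow)^2} \tag{21}$$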

In figure 5 both approximations are compared with the numerical solution for $T = 0.2$,
0.3 and the case $T = 0$ for $m_\uparrow = 0.6$ and $a = A = 0.3$. Therefore we have
$\gamma_1 = 0.822/A > \gamma_2 = 0.679/A$, and the first approximation (20) gives the lower
curves, the second (21) the upper ones. As expected (21) is better for small $\alpha$, as
$T_c$ was evaluated at $\alpha = 0$. For $\alpha_{\uparrow 1}$ and $\alpha_{\downarrow 1}$ to be accurate we have the condition
$\alpha \gg \max(\mu_\uparrow - Q,\ Q - \mu_\downarrow)\,T/(2A) \approx 0.1$ from above, making it better for higher storage
levels.

We also see that with increasing temperature the ansatz of a simple downshift of the
parabolas $\alpha_\uparrow(Q)$ and $\alpha_\downarrow(Q)$ becomes more and more inaccurate. But even for $T = 0.3$,
which is relatively high, the approximations are quite good. Therefore $\alpha_c$ decreases
proportionally to $T^2$ for small temperatures, which is in accordance with former studies
in equilibrium (see e.g. [?]). From the $1/A$-dependence of $\gamma_1$ and $\gamma_2$ we know that the
effect of positive temperature is much more drastic for small network activities than for
large ones.
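The two coefficients can be evaluated for the example quoted above. The sketch below assumes $\gamma_1 = \pi^2/(12A)$ and $\gamma_2 = \alpha_c|_{T=0}/T_c^2$, written out with the zero-temperature capacity and the critical temperature of the previous sections; the names `gamma_1` and `gamma_2` are hypothetical and the standard normal quantile again replaces $\sqrt{2}\,\mathrm{inverf}(2m - 1)$.

```python
import math
from statistics import NormalDist

def gamma_1(A):
    """Coefficient of the T^2 downshift from the low-temperature expansion."""
    return math.pi ** 2 / (12.0 * A)

def gamma_2(m_up, m_down, A):
    """Coefficient alpha_c(T=0) / T_c^2 from the critical temperature."""
    logs = math.log(1.0 / m_up - 1.0) + math.log(1.0 / m_down - 1.0)
    quantile = NormalDist().inv_cdf
    c_sum = quantile(m_up) + quantile(m_down)
    return logs ** 2 / (4.0 * A * c_sum ** 2)

# Example from the text: m_up = 0.6 and a = A = 0.3, so m_down follows from A = a.
a = A = 0.3
m_up = 0.6
m_down = 1.0 - a / (1.0 - a) * (1.0 - m_up)
g1_times_A = gamma_1(A) * A                  # about 0.822
g2_times_A = gamma_2(m_up, m_down, A) * A    # about 0.679
```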
4. Conclusion

In this work we described a randomly diluted neural network model with variable pattern
activity using a probabilistic approach to solve the one-step dynamics with one
condensed pattern. By carrying out the analysis as exactly as possible we were able to
confirm with this relatively simple approach many previous results on this model, derived
with different techniques and often restricted to special cases. Most of our results
are valid for the whole range of the different parameters, except for the dilution level,
generalizing some former studies.

We gained new insight into the dynamical properties of the network by studying the critical
storage capacity, information content and critical temperature for arbitrary network states.
Special focus was on the resulting constraints on the threshold to realize the critical
values, a feature that was often overlooked in former studies. We used this to analyze
the effects of choices of threshold functions during retrieval. We also showed that we
have to impose conditions during the retrieval process to get the enhanced properties
derived in former studies.

To demonstrate the possibilities of the probabilistic approach we chose a neural network
which is relatively simple, but has many realistic features. The analysis is not restricted
to this model; it can be used for all networks with parallel updating. Also important
is the site-independent description of the local field, in our case represented by $\mu_\uparrow$,
$\mu_\downarrow$ and $\sigma$, excluding for example site-dependent thresholds. But the method is easily
extendable to other models with graded response neurons (see [10]), groups of patterns
with different activities, a finite number of condensed patterns or sequential patterns
(see [?]).

It may be interesting to extend the presented analysis to these cases, making it possible
to describe various network properties exactly with relatively easy computations.
Especially the restrictions we derived on the parameters during retrieval may be very helpful
for simulations.
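Appendix. Calculation for $T > 0$

In section 3.2 we want to calculate $\alpha_\uparrow$ and $\alpha_\downarrow$ from the dynamical equations for $T > 0$.
In the following we perform the calculation for $\alpha_\uparrow$; $\alpha_\downarrow$ can be evaluated in exactly
the same way. First we split the integral in the dynamical equation (6):

$$m_\uparrow(t+1) = \int_{-\infty}^{0} \frac{\exp(2\beta x)}{1 + \exp(2\beta x)}\,p_\uparrow(x)\,dx + \int_{0}^{\infty} \frac{1}{1 + \exp(-2\beta x)}\,p_\uparrow(x)\,dx$$

By using the geometric series $1/(1 + y) = \sum_{n=0}^{\infty} (-1)^n y^n$ for $|y| < 1$ and replacing $x$ by
$-x$ in the first integral we get:

$$m_\uparrow(t+1) = \sum_{n=0}^{\infty} (-1)^n \int_{0}^{\infty} \left[ e^{-2\beta x n}\,p_\uparrow(x) + e^{-2\beta x (n+1)}\,p_\uparrow(-x) \right] dx$$

Inserting $p_\uparrow(x)$ from equation (5), we have to calculate two Gaussian integrals of the
form

$$\int_{0}^{\infty} e^{-q_1 x - q_2 x^2}\,dx = \frac{1}{2} \sqrt{\frac{\pi}{q_2}}\,\exp\left(\frac{q_1^2}{4 q_2}\right) \left[ 1 - \mathrm{erf}\left(\frac{q_1}{2\sqrt{q_2}}\right) \right].$$

After this we see that $m_\uparrow(t+1)|_{T=0}$ is the term in the sum for $n = 0$ and we get the
expression

$$m_\uparrow(t+1) = \frac{1}{2} \left[ 1 + \mathrm{erf}\left(\frac{\mu_\uparrow - Q}{\sqrt{2}\,\sigma}\right) \right] - \Delta = m_\uparrow(t+1)|_{T=0} - \Delta$$

with the correction term

$$\Delta = \frac{1}{2} \sum_{n=1}^{\infty} (-1)^n e^{2\sigma^2\beta^2 n^2} \left[ e^{2\beta(\mu_\uparrow - Q)n} \left( 1 - \mathrm{erf}\left( \sqrt{2}\,\sigma\beta n + \frac{\mu_\uparrow - Q}{\sqrt{2}\,\sigma} \right) \right) - e^{-2\beta(\mu_\uparrow - Q)n} \left( 1 - \mathrm{erf}\left( \sqrt{2}\,\sigma\beta n - \frac{\mu_\uparrow - Q}{\sqrt{2}\,\sigma} \right) \right) \right].$$

This expression is still exact, but in order to do the sum we use the approximation
$1 - \mathrm{erf}(x) \approx \exp(-x^2)/(\sqrt{\pi}\,x)$ [14]. This is accurate for $x \gg 1$, which means in our case
$\beta \gg (\mu_\uparrow - Q)/(2\sigma^2)$. The expression for $\Delta$ simplifies to

$$\Delta \approx -\frac{T^2 (\mu_\uparrow - Q)}{2\sqrt{2\pi}\,\sigma^3}\,e^{-\left(\frac{\mu_\uparrow - Q}{\sqrt{2}\sigma}\right)^2} \sum_{n=1}^{\infty} \frac{(-1)^n}{n^2 - \left( \frac{T(\mu_\uparrow - Q)}{2\sigma^2} \right)^2}.$$

The sum can be done exactly and we end up with the following, writing $T = \beta^{-1}$:

$$m_\uparrow(t+1) \approx m_\uparrow(t+1)|_{T=0} - T^2\,\frac{\mu_\uparrow - Q}{2\sqrt{2\pi}\,\sigma^3}\,e^{-\left(\frac{\mu_\uparrow - Q}{\sqrt{2}\sigma}\right)^2}\,\Gamma\left( T\,\frac{\mu_\uparrow - Q}{2\sigma^2} \right)$$

$$\Gamma(x) = \frac{\pi x - \sin \pi x}{2 x^2 \sin \pi x} = \frac{\pi^2}{12} + \frac{7\pi^4}{720}\,x^2 + \dots$$

For $T \ll 2\sigma^2/(\mu_\uparrow - Q)$ the argument of $\Gamma$ is small and we use the zero order
approximation $\Gamma \approx \frac{\pi^2}{12}$.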
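The quality of the zero order approximation can be checked numerically. The sketch below compares a direct trapezoidal quadrature of the dynamical equation for $m_\uparrow(t+1)$ with the low-temperature formula, assuming the correction term $T^2 \frac{\pi^2}{12} \frac{\mu_\uparrow - Q}{2\sqrt{2\pi}\sigma^3} e^{-(\mu_\uparrow - Q)^2/(2\sigma^2)}$ derived above. The function names and the parameter values are illustrative only; `d` stands for $\mu_\uparrow - Q$.

```python
import math

def m_next_quadrature(d, sigma, T, n=20001, width=8.0):
    """Integrate g(x, 1/T) times a Gaussian of mean d and width sigma (trapezoidal rule)."""
    lo, hi = d - width * sigma, d + width * sigma
    step = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        x = lo + k * step
        g = 0.5 * (1.0 + math.tanh(x / T))           # equals 1 / (1 + exp(-2 x / T))
        p = math.exp(-(x - d) ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * g * p
    return total * step

def m_next_low_T(d, sigma, T):
    """Zero-temperature value minus the T^2 correction with Gamma replaced by pi^2 / 12."""
    m0 = 0.5 * (1.0 + math.erf(d / (math.sqrt(2.0) * sigma)))
    corr = (T ** 2 * math.pi ** 2 / 12.0
            * d / (2.0 * math.sqrt(2.0 * math.pi) * sigma ** 3)
            * math.exp(-d ** 2 / (2.0 * sigma ** 2)))
    return m0 - corr

# Illustrative values with T much smaller than 2 sigma^2 / d:
d, sigma, T = 0.2, 0.5, 0.1
exact = m_next_quadrature(d, sigma, T)
approx = m_next_low_T(d, sigma, T)
```

With these values the argument of $\Gamma$ is $T d/(2\sigma^2) = 0.04$, so the zero order term should dominate the correction.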